
Implement parallelization options as explicit inputs. #554

Merged: 2 commits merged into aiidateam:develop on Nov 23, 2020

Conversation

@greschd (Member) commented Sep 18, 2020

Fixes #405.

If any parallelization option is already specified in `settings['CMDLINE']`, a `DeprecationWarning` is issued; however, the value from `settings['CMDLINE']` still takes precedence. This allows us to implement the defaults at the level of the input port, without having to care about them later on.

The BasePwCpInputGenerator is given three class attributes:

  • _ALLOWED_PARALLELIZATION_FLAGS is a list of tuples (flag_name, default, help), of all possible flags
  • _ENABLED_PARALLELIZATION_FLAGS is a list of flag names that are implemented in a particular code.
  • _PARALLELIZATION_FLAG_ALIASES is used to detect all possible variations on a flag name.
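The three attributes described above can be sketched as follows. This is an illustrative skeleton only: the flag names, defaults and help strings shown here are assumptions for the example, not the actual plugin code.

```python
class BasePwCpInputGenerator:
    """Illustrative skeleton; the real class lives in aiida-quantumespresso."""

    # (flag_name, default, help) tuples for all flags any QE code may accept.
    _ALLOWED_PARALLELIZATION_FLAGS = (
        ('npool', 1, 'Number of k-point pools.'),
        ('ndiag', None, 'Number of processes for linear-algebra diagonalization.'),
    )

    # Subset of flag names actually implemented by a particular code.
    _ENABLED_PARALLELIZATION_FLAGS = ('npool', 'ndiag')

    # Alternative spellings QE recognizes for each canonical flag name.
    _PARALLELIZATION_FLAG_ALIASES = {
        'npool': ('npools', 'nk', 'nks'),
        'ndiag': ('northo', 'nd'),
    }
```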

TODO:

  • Testing
  • Documentation: Not sure where best to put this - the auto-generated help will be there in any case.
  • Check if flag names are case sensitive. If not, this needs to be taken into account when checking for existing flags. Update: If flags are capitalized differently (e.g. -nPools) QE will simply ignore them.
  • Set _ENABLED_PARALLELIZATION_FLAGS for codes other than pw.x. Help needed: I have no idea what flags other codes support.
  • PW accepts -nimage, but I'm not sure if passing that makes any sense (a PW calculation only treats one "image"). If it doesn't, we should probably disable nimage in the PwCalculation. Update: Disallowed -nimage for pw.x for now - if there is in fact a use case, we can always re-enable it.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 122374f to 40aaf0e Compare September 18, 2020 22:04
@greschd (Member, Author) commented Sep 18, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

@greschd (Member, Author) commented Sep 18, 2020

@sphuber this is ready for review, but not quite for merging yet.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 1a41823 to 0f92866 Compare September 22, 2020 14:10
@greschd (Member, Author) commented Sep 22, 2020

Aliases are taken from https://gitlab.com/QEF/q-e/-/blob/develop/Modules/command_line_options.f90#L93

The choice of which alias is used in the interface is somewhat arbitrary - now it's the one that matches the internal variable name in QE.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 2c1962c to 5e43d1e Compare October 8, 2020 10:36
@sphuber (Contributor) commented Oct 8, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

Is that a bug in yapf or intended behavior? I thought those statements only applied to the scope to which they were applied. So I assumed that was only being applied to the define method. How did you discover this? And where should it be placed then? On the line of method definition?

# in QE codes. The flags that are actually implemented in a given
# code should be specified in the '_ENABLED_PARALLELIZATION_FLAGS'
# list of each calculation subclass.
_ALLOWED_PARALLELIZATION_FLAGS = ( # yapf: disable
Contributor:
Two things:

  1. I find the prefix "allowed" a bit weird. Why not just _PARALLELIZATION_FLAGS?
  2. Why not make this a dictionary with the flag name as the key? I think you only use it in line 125, and there you transform it into a dict. Might as well define it as one then and save those lines.

Member (Author):

Yeah good point, will change that.

Member (Author):

Done!

# bind to the name `default`, and look up its value only when the lambda is
# actually called. Because `default` changes throughout the for-loop, this
# would give the wrong value.
extra_kwargs = {'default': partial(orm.Int, default)}
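The late-binding pitfall the quoted comment refers to can be reproduced in isolation: a plain lambda looks up the loop variable only when it is called, whereas `functools.partial` captures the current value at definition time. This is a generic Python illustration, not the plugin code.

```python
from functools import partial

# Plain lambdas close over the *name* `default`; by the time they are
# called, the loop has finished and they all see the last value.
late_bound = [lambda: default for default in (1, 2, 3)]
print([func() for func in late_bound])  # [3, 3, 3]

# `partial` freezes the *current value* of `default` at creation time.
early_bound = [partial(lambda value: value, default) for default in (1, 2, 3)]
print([func() for func in early_bound])  # [1, 2, 3]
```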
Contributor:

Not sure about specifying these defaults. I think this would mean that all these calculations will now always have a bunch of `Int` nodes as inputs, even if the user doesn't specify anything. I am not sure we want this, at least not until we have proper automatic node deduplication in the database. If we still need the defaults to be set for consistency, we can maybe just apply them in `prepare_for_submission` of the base class, without going through input nodes.

Member (Author):

I added these defaults mostly for caching: Two calculations could otherwise have different inputs but be functionally equivalent. They wouldn't use caching, just because the inputs are different.

I'll leave it up to you to decide if that's an important reason - I can also see the argument that the extra input nodes are unnecessary. Otherwise I can just remove the defaults.

Contributor:

Hmm, didn't think about the caching. Then again, missing a cache hit because of the parallelization flags is maybe not too bad, as it should not occur too often. That reminds me though: didn't you mention somewhere that the actual defaults in QE are a bit surprising, i.e. not all 1, or something like that?

Member (Author):

The QE defaults are 1 for all inputs except ndiag - that's why we don't have a default for that here. QE determines it based on the number of available processors and some of the other parallelization levels.

Should I remove the defaults for the other parallelization options, then?

Contributor:

Should I remove the defaults for the other parallelization options, then?

I think you have already done this, haven't you? All defaults in `_PARALLELIZATION_FLAGS` are set to `None`, which means that `required=False` will be set on the corresponding port. This means that no node will be generated unless one is passed explicitly, which is the behavior I was suggesting. So I think this now respects that, or am I missing something?

Member (Author) @greschd replied on Oct 13, 2020:

Right, I'm asking whether the code implementing defaults should also be removed. Essentially, the `default is not None` code path is dead now.

Contributor:

the `default is not None` code path is dead now.

Not quite. A subclass could still override `_PARALLELIZATION_FLAGS` to define defaults for one or more flags and automatically get the spec to work correctly. Not sure if this would be useful, but it could be the case. So maybe keeping the code is ok. That being said, what is this `MappingProxyType`, and why do you use it to create the class attribute?

Member (Author):

Ok, fine with me to leave it.

The MappingProxyType is an immutable wrapper around a dictionary. If nobody has a reference to the underlying dictionary, the object as a whole is immutable (at least, unless someone goes digging deep for a way to change it).

Here I want it to be immutable because otherwise it's easy to mess up the base class by modifying _PARALLELIZATION_FLAGS on a child class.
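A quick standalone illustration of that immutability (the flag contents here are illustrative, not the plugin's actual table):

```python
from types import MappingProxyType

_flags = {'npool': 'Number of k-point pools.'}
PARALLELIZATION_FLAGS = MappingProxyType(_flags)

# Reads behave exactly like a normal dict.
print(PARALLELIZATION_FLAGS['npool'])

# Writes through the proxy are rejected with a TypeError, protecting the
# base-class attribute from accidental modification in subclasses.
try:
    PARALLELIZATION_FLAGS['nband'] = 'oops'
except TypeError:
    print('proxy is read-only')
```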

Comment on lines 67 to 251
inputs['settings'] = Dict(dict={'CMDLINE': ['-{}'.format(flag_name), '2']})
with pytest.warns(DeprecationWarning):
generate_calc_job(fixture_sandbox, entry_point_name, inputs)
Contributor:

maybe still check here that the value of the settings is kept and not overridden?

Member (Author):

Done!

@greschd (Member, Author) commented Oct 8, 2020

Thanks for the review @sphuber, I'll make the changes tomorrow. Do you think we should aim to get this into the release?

@sphuber (Contributor) commented Oct 8, 2020

Thanks for the review @sphuber, I'll make the changes tomorrow. Do you think we should aim to get this into the release?

Honestly, since this is a bit of an experiment, would it maybe make sense to not include it so that we have it in develop for some time and can get others to test it to see if they like it?

@greschd (Member, Author) commented Oct 8, 2020

Sure, I don't mind either way.

@greschd (Member, Author) commented Oct 9, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

Is that a bug in yapf or intended behavior? I thought those statements only applied to the scope to which they were applied. So I assumed that was only being applied to the define method. How did you discover this? And where should it be placed then? On the line of method definition?

I'm not sure, but I'd lean towards a yapf bug. I discovered it when adding a # yapf: enable statement somewhere else (reversing the effect of that disable statement), and it ended up reformatting a large part of the file.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from dc05900 to 63a28c5 Compare October 9, 2020 08:39
@greschd (Member, Author) commented Oct 9, 2020

Alright, I've addressed the comments above. One thing to still decide is whether we want to keep the ability to set defaults on the parallelization options.

If not, I would remove the corresponding code. I think it's still prudent to give the parallelization options set via settings['CMDLINE'] higher precedence though, but only because it's difficult to reliably strip them, since e.g. ['-ndiag', '', '2'] is also technically valid.

Note that I've removed the py3.5 test because we're targeting post-release, so I went with py3.6 compatibility only. I will rebase after we remove py3.5 compatibility.

@greschd greschd force-pushed the parallelization_inputs branch 3 times, most recently from 0c1e2ee to 333ce3c Compare October 13, 2020 14:17
@giovannipizzi (Member) commented:
A few comments on my side:

  • ok to keep the settings higher priority for now; however, I'm not sure ['-ndiag', '', '2'] is equivalent: when creating the submission script we escape everything, so I think that QE really gets an empty string for the value after -ndiag (whether fortran is then weird and skips it, I don't know)

  • is there a strong reason to use different ports (in a namespace) instead of a parallelisation Dict input node, doing validation later? I agree having multiple ports helps with defaults, docs etc., but exactly because every code might use different parallelisation options, and these might change between QE versions (e.g. -npools, -nks and -nk are, or used to be, all equivalent I think), isn't it better to just put the complex logic in a single python file?

@greschd (Member, Author) commented Oct 15, 2020

A few comments on my side:

* ok to keep the settings higher priority for now; however I'm not sure `['-ndiag', '', '2']` is equivalent, when creating the submissions script we escape everything so I think that QE really gets an empty string for the value after `-ndiag` (if then fortran is weird and skips it, I don't know)

Hm yeah, I have to admit I didn't test this. The point I was trying to make is that I'm not sure I'd get "stripping out" the explicitly passed parameters right, exactly because it's not very clear how these edge cases are handled.

* is there a strong reason to use different ports (in a namespace) instead of a `parallelisation` Dict input node, and do validation later? I agree having multiple ports helps for having defaults, docs etc., but exactly because every code might be using different parallelisation options, and these might change between QE versions (see e.g. `-npools`, `-nks`, `-nk` are (or used to be?) all equivalent I think), isn't it better to just put the complex logic in a single python file?

No particularly strong reason. In #405 I proposed both variants. The advantage of separate inputs is that it plays nicer with defaults:

  • it's not necessary to handle the defaults / missing parameters in code
  • it works nicer with caching: explicitly specifying the default value is the same as not specifying anything

Since we now don't specify the defaults anyway, I'm not sure how much weight that argument carries.

The "equivalent aliases" are handled: Only one variant is allowed, see the _PARALLELIZATION_FLAG_ALIASES.
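Such an alias table can be used to map any spelling back to a canonical name, roughly like this. The alias sets follow QE's `command_line_options.f90` naming, but the helper itself is an illustrative sketch, not the plugin's actual implementation.

```python
# Canonical name -> all spellings QE accepts (subset shown; illustrative).
_PARALLELIZATION_FLAG_ALIASES = {
    'npool': ('npool', 'npools', 'nk', 'nks'),
    'ndiag': ('ndiag', 'northo', 'nd'),
}

def canonicalize(flag):
    """Map any alias (with or without leading dash) to its canonical name."""
    name = flag.lstrip('-')
    for canonical, aliases in _PARALLELIZATION_FLAG_ALIASES.items():
        if name in aliases:
            return canonical
    return None  # not a known parallelization flag

print(canonicalize('-nk'))     # 'npool'
print(canonicalize('northo'))  # 'ndiag'
```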

As far as different QE versions go: to be honest, I hadn't thought of that, but I think it wouldn't be too hard to add. Since we would still want to support "older" QE versions, at most we will have to add new parallelization flags. We can always add validation that checks if the used flags are compatible with the used QE version. We'd first have to introduce an input for passing the QE version, but that's the same for both variants.

If there's preference for the Dict version, I can switch that around, no problem.

@greschd (Member, Author) commented Nov 9, 2020

If there's preference for the Dict version, I can switch that around, no problem.

@sphuber @giovannipizzi what's your preference here?

Honestly, since this is a bit of an experiment, would it maybe make sense to not include it so that we have it in develop for some time and can get others to test it to see if they like it?

Would soon-ish be a good time to push it into develop to get some exposure, or do you have a release planned soon?

@sphuber (Contributor) commented Nov 10, 2020

I am planning a release soon, which is needed for the common workflows project. Not sure if that should stop us, though. I am not sure how many people are actually working off develop and are willing to experiment with it. I have to go now, but I will try to come back to this later today and see what I think we should do.

@sphuber (Contributor) commented Nov 10, 2020

I think going with the single Dict node is preferable for now:

  • I am not sure we can justify creating around 4 additional base type nodes for each PwCalculation, especially by default.
  • We could still provide a default for the Dict node, specifying a dict with all flags set to 1 except ndiag, because it is not necessarily 1. This would essentially solve the caching problem, wouldn't it? The ndiag value is more or less included indirectly through the options where the number of processes per machine and the number of machines are defined. I am not sure if this is fully correct, because the default number of processes is set on the computer and can maybe be changed. That all being said, I am not sure we should create this node by default. I am not sure how many people are actually specifying parallelization flags currently, so potentially we would be adding a lot of extra nodes. If that is just to help with the unlikely caching hit, I am not sure it is worth it.
  • You mentioned that the individual nodes are also useful to automatically deal with default handling, but you still need to add code to prepare_for_submission, right? If we go to a single optional dict, you can simply add the handling code there, couldn't you? Or am I missing something?

TLDR: I think for now we should just be conservative and make a single optional Dict node called parallelization.

@greschd (Member, Author) commented Nov 10, 2020

Alright, got it! I'll make a single Dict, but without a default. In that case we can also raise instead of just warning when an option is specified both explicitly and in the CMDLINE.
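The raise-on-conflict behavior could look roughly like the sketch below. The function name is an assumption for illustration; the real check in the plugin also has to normalize aliases and whitespace, as discussed above.

```python
def validate_no_conflicts(parallelization, cmdline):
    """Raise if a flag appears both in `parallelization` and in the cmdline.

    `parallelization` is a plain dict of flag names to values; `cmdline`
    is a list of tokens as stored in settings['CMDLINE'].
    """
    explicit = set(parallelization)
    from_cmdline = {token.lstrip('-') for token in cmdline if token.startswith('-')}
    conflicts = sorted(explicit & from_cmdline)
    if conflicts:
        raise ValueError(f'Parallelization flags specified twice: {conflicts}')

validate_no_conflicts({'npool': 4}, ['-ndiag', '1'])  # fine: disjoint flags
```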

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 53516e4 to f8a6244 Compare November 11, 2020 20:01
@greschd (Member, Author) commented Nov 11, 2020

Alright, switched the implementation to a single Dict input. The main TODOs now are documentation (where should I put this?) and adding _ENABLED_PARALLELIZATION_FLAGS to other codes (I don't know which flags each code takes).

@greschd (Member, Author) commented Nov 18, 2020

Note to self: the (largely undocumented) -n_howmany flag is missing from this PR.

@sphuber sphuber (Contributor) left a review:

Thanks @greschd. Apart from some minor requests, this is ready to be merged. I don't think we need to specify the _ENABLED_PARALLELIZATION_FLAGS for any other plugins yet; we can check how this goes for pw.x and then apply it elsewhere when needed, once we figure out how it works. As for the docs, those are a bit of a mess. If you can add a section anywhere close to the PwCalculation for now, that would be great; at some point we will hopefully rework the docs with @mbercx and incorporate it in the correct place then.

Review threads on aiida_quantumespresso/calculations/__init__.py (2, outdated) marked as resolved.
@greschd greschd force-pushed the parallelization_inputs branch 3 times, most recently from b910b7f to d2edbd7 Compare November 20, 2020 09:54
@greschd greschd requested a review from sphuber November 20, 2020 09:55
@greschd greschd marked this pull request as ready for review November 20, 2020 09:56
Implements the parallelization options as a single `Dict`
given in the `parallelization` input.

If any parallelization option is specified twice (either
both in `parallelization` and in `settings['CMDLINE']`, or
more than once in `settings['CMDLINE']`), an
`InputValidationError` is raised.
If a parallelization option is given only in
`settings['CMDLINE']`, an `AiidaDeprecationWarning` is issued.

The `BasePwCpInputGenerator` is given three class attributes:
- `_PARALLELIZATION_FLAGS` is a mapping `{flag_name: help}`
   of all possible flags.
- `_ENABLED_PARALLELIZATION_FLAGS` is a tuple of flag names that
  are implemented in a particular code.
- `_PARALLELIZATION_FLAG_ALIASES` is used to detect all possible
  variations on a flag name. These are taken from the QE source.

When checking for existing parallelization flags in the manually
passed cmdline parameters, these are normalized by splitting on
whitespace.
QE ignores flags that are capitalized differently, so we do not
have to normalize capitalization here.

Currently, `_ENABLED_PARALLELIZATION_FLAGS` is set only for PW
calculations. Technically it accepts the `-nimage` flag, but we
disallow it because (AFAIK) it doesn't make any sense - it just
repeats the same calculation.

Co-authored-by: Sebastiaan Huber <mail@sphuber.net>
@greschd (Member, Author) commented Nov 20, 2020

For the documentation, I just added a description to the Inputs section in pw.rst - since this isn't all that complicated, I don't think it needs a dedicated section.

Without the `__init__.py`, `pylint` thinks that `types.py` is a
stand-alone module, which would shadow the `types` standard
library module.
@greschd (Member, Author) commented Nov 20, 2020

Had to add an empty __init__.py to aiida_quantumespresso/common; otherwise types.py can shadow the standard library types module - at least in the eyes of pylint.

@sphuber sphuber merged commit 4fabecc into aiidateam:develop Nov 23, 2020
@sphuber (Contributor) commented Nov 23, 2020

Thanks a lot @greschd

@greschd greschd deleted the parallelization_inputs branch November 23, 2020 14:52
@greschd (Member, Author) commented Nov 23, 2020

Thanks for reviewing @sphuber!

@greschd (Member, Author) commented Nov 24, 2020

@sphuber I just noticed the squashed commit message for this PR is incorrect / outdated.

    `PwCalculation`: add explicit `parallelization` input port (#554)
    
    This new input port accepts a dictionary that allows one to specify the
    parallelization flags that `pw.x` accepts. This supersedes the old method
    of defining the flags in the `settings `input node. The latter is
    deprecated but still takes precedence if the same flag is specified
    through both input ports.

Specifically, we raise "if the same flag is specified through both input ports". Just FYI; I'm not sure if it makes sense to go back and modify the commit message, but we should probably make sure it's correct in an eventual changelog.

@sphuber (Contributor) commented Nov 25, 2020

So the only incorrect part is that it doesn't silently take the settings value in case of conflict, but actually raises? I think that is not too bad a mistake, and given that it has been in develop for a while, I don't think I should correct the message and force-push now.

@greschd (Member, Author) commented Nov 25, 2020

Yeah, makes sense - force-pushing to develop seems like a bit of a drastic measure. At the end of the day, there's always the possibility that old commit messages are now outdated.

Successfully merging this pull request may close these issues:

  • Promote parallelization settings to "proper" inputs?